Overview

Dataset statistics

Number of variables: 21
Number of observations: 1000
Missing cells: 171
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 396.7 KiB
Average record size in memory: 406.2 B
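
The two derived figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Raw figures copied from the dataset statistics above.
n_variables = 21
n_observations = 1000
missing_cells = 171
total_size_kib = 396.7

# Share of missing cells over the whole table (assumed denominator: all cells).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 0.8%
print(f"{avg_record_bytes:.1f} B")  # 406.2 B
```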

Variable types

NUM: 12
BOOL: 4
CAT: 4
DATE: 1

Reproduction

Analysis started: 2020-08-20 12:56:35.177499
Analysis finished: 2020-08-20 12:57:05.488512
Duration: 30.31 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Store is highly correlated with df_index [High correlation]
df_index is highly correlated with Store [High correlation]
Customers is highly correlated with Sales [High correlation]
Sales is highly correlated with Customers [High correlation]
sales_per_customer has 171 (17.1%) missing values [Missing]
df_index has unique values [Unique]
Sales has 171 (17.1%) zeros [Zeros]
Customers has 171 (17.1%) zeros [Zeros]
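
The three counts of 171 line up: a row with zero Customers cannot yield a sales-per-customer ratio. A minimal sketch of that relationship, assuming sales_per_customer is computed as Sales / Customers (the report does not state this, but the sample rows are consistent with it). The helper name is hypothetical:

```python
import math

def sales_per_customer(sales: float, customers: float) -> float:
    """Return sales / customers, or NaN when there are no customers."""
    if customers == 0:
        return math.nan  # undefined ratio, shows up as a missing value
    return sales / customers

# (Sales, Customers) pairs taken from the sample rows of this report.
rows = [(3328, 315), (0, 0), (9739, 809)]
ratios = [sales_per_customer(s, c) for s, c in rows]
print([round(r, 6) for r in ratios])
```

In pandas, dividing a zero by a zero in two numeric columns yields NaN without an explicit check, which would produce the same missing values.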

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE
Distinct count1000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean503563.579
Minimum827
Maximum1015129
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum827
5-th percentile59821.25
Q1260922.75
median488470
Q3752031
95-th percentile960340.75
Maximum1015129
Range1014302
Interquartile range (IQR)491108.25

Descriptive statistics

Standard deviation285526.0505
Coefficient of variation (CV)0.5670109245
Kurtosis-1.155248824
Mean503563.579
Median Absolute Deviation (MAD)241878
Skewness0.05046897516
Sum503563579
Variance8.152512549e+10
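
Several of the entries above are derived from others. A minimal sketch that re-checks them with the df_index figures, assuming the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation / mean, variance = standard deviation squared):

```python
import math

# Figures copied from the df_index statistics above.
minimum, maximum = 827, 1015129
q1, q3 = 260922.75, 752031
mean, std = 503563.579, 285526.0505

value_range = maximum - minimum  # 1014302
iqr = q3 - q1                    # 491108.25
cv = std / mean                  # coefficient of variation
variance = std ** 2

print(value_range, iqr)
print(round(cv, 6))
print(f"{variance:.4e}")
```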
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
485374 1 0.1%
 
920241 1 0.1%
 
740100 1 0.1%
 
152261 1 0.1%
 
572100 1 0.1%
 
1002503 1 0.1%
 
431700 1 0.1%
 
246462 1 0.1%
 
469692 1 0.1%
 
919525 1 0.1%
 
Other values (990) 990 99.0%
 
ValueCountFrequency (%) 
827 1 0.1%
 
1441 1 0.1%
 
1519 1 0.1%
 
2117 1 0.1%
 
3649 1 0.1%
 
ValueCountFrequency (%) 
1015129 1 0.1%
 
1013951 1 0.1%
 
1012717 1 0.1%
 
1011016 1 0.1%
 
1010994 1 0.1%
 

Store
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count665
Unique (%)66.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean552.904
Minimum1
Maximum1113
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1
5-th percentile65.95
Q1288
median536
Q3826.25
95-th percentile1053.1
Maximum1113
Range1112
Interquartile range (IQR)538.25

Descriptive statistics

Standard deviation312.9912816
Coefficient of variation (CV)0.5660861226
Kurtosis-1.15569261
Mean552.904
Median Absolute Deviation (MAD)264.5
Skewness0.04978727165
Sum552904
Variance97963.54233
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
281 5 0.5%
 
211 4 0.4%
 
678 4 0.4%
 
880 4 0.4%
 
291 4 0.4%
 
271 4 0.4%
 
268 4 0.4%
 
943 4 0.4%
 
441 4 0.4%
 
527 4 0.4%
 
Other values (655) 959 95.9%
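
The frequency table above lists the ten most common values and folds the remainder into an "Other values" row. A minimal sketch of that layout on toy data, using collections.Counter; the function name and the top_n parameter are illustrative and not the report generator's API:

```python
from collections import Counter

def frequency_table(values, top_n=10):
    """Return (label, count, percent) rows: top_n values, then the remainder."""
    counts = Counter(values)
    total = len(values)
    top = counts.most_common(top_n)
    rows = [(str(v), c, 100 * c / total) for v, c in top]
    rest_distinct = len(counts) - len(top)
    if rest_distinct > 0:
        rest_count = total - sum(c for _, c in top)
        rows.append((f"Other values ({rest_distinct})", rest_count,
                     100 * rest_count / total))
    return rows

# Toy data: three frequent values plus three singletons.
toy = [281] * 5 + [211] * 4 + [678] * 4 + [1, 2, 3]
for label, count, pct in frequency_table(toy, top_n=3):
    print(label, count, f"{pct:.1f}%")
```

For Store, the ten listed counts sum to 41, which leaves 959 rows spread over the other 655 distinct values.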
 
ValueCountFrequency (%) 
1 1 0.1%
 
2 2 0.2%
 
3 1 0.1%
 
4 1 0.1%
 
6 1 0.1%
 
ValueCountFrequency (%) 
1113 1 0.1%
 
1112 1 0.1%
 
1111 1 0.1%
 
1109 2 0.2%
 
1107 2 0.2%
 

DayOfWeek
Real number (ℝ≥0)

Distinct count7
Unique (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.991
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median4
Q36
95-th percentile7
Maximum7
Range6
Interquartile range (IQR)4

Descriptive statistics

Standard deviation1.969466374
Coefficient of variation (CV)0.4934769165
Kurtosis-1.186803223
Mean3.991
Median Absolute Deviation (MAD)2
Skewness0.0401277736
Sum3991
Variance3.878797798
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
4 160 16.0%
 
3 157 15.7%
 
7 146 14.6%
 
2 139 13.9%
 
5 138 13.8%
 
1 134 13.4%
 
6 126 12.6%
 
ValueCountFrequency (%) 
1 134 13.4%
 
2 139 13.9%
 
3 157 15.7%
 
4 160 16.0%
 
5 138 13.8%
 
ValueCountFrequency (%) 
7 146 14.6%
 
6 126 12.6%
 
5 138 13.8%
 
4 160 16.0%
 
3 157 15.7%
 

Date
Date

Distinct count615
Unique (%)61.5%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
Minimum2013-01-01 00:00:00
Maximum2015-07-31 00:00:00
Histogram

Sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count790
Unique (%)79.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5651.73
Minimum0
Maximum16331
Zeros171
Zeros (%)17.1%
Memory size7.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q13882.75
median5745.5
Q37848.5
95-th percentile11562.75
Maximum16331
Range16331
Interquartile range (IQR)3965.75

Descriptive statistics

Standard deviation3515.341922
Coefficient of variation (CV)0.6219939598
Kurtosis-0.2589751912
Mean5651.73
Median Absolute Deviation (MAD)1969.5
Skewness0.05063180166
Sum5651730
Variance12357628.83
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0 171 17.1%
 
7135 3 0.3%
 
5432 3 0.3%
 
8383 3 0.3%
 
3863 3 0.3%
 
7303 2 0.2%
 
9565 2 0.2%
 
6020 2 0.2%
 
5703 2 0.2%
 
5178 2 0.2%
 
Other values (780) 807 80.7%
 
ValueCountFrequency (%) 
0 171 17.1%
 
1032 1 0.1%
 
1685 1 0.1%
 
1736 1 0.1%
 
1847 1 0.1%
 
ValueCountFrequency (%) 
16331 1 0.1%
 
15537 1 0.1%
 
15535 1 0.1%
 
15171 1 0.1%
 
15077 1 0.1%
 

Customers
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count542
Unique (%)54.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean605.101
Minimum0
Maximum1868
Zeros171
Zeros (%)17.1%
Memory size7.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1409.5
median622.5
Q3817
95-th percentile1292.1
Maximum1868
Range1868
Interquartile range (IQR)407.5

Descriptive statistics

Standard deviation383.820561
Coefficient of variation (CV)0.6343082576
Kurtosis0.1920091675
Mean605.101
Median Absolute Deviation (MAD)204.5
Skewness0.2422526649
Sum605101
Variance147318.223
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0 171 17.1%
 
523 6 0.6%
 
635 6 0.6%
 
491 5 0.5%
 
641 5 0.5%
 
501 5 0.5%
 
700 5 0.5%
 
643 4 0.4%
 
668 4 0.4%
 
717 4 0.4%
 
Other values (532) 785 78.5%
 
ValueCountFrequency (%) 
0 171 17.1%
 
176 1 0.1%
 
182 1 0.1%
 
209 1 0.1%
 
210 1 0.1%
 
ValueCountFrequency (%) 
1868 1 0.1%
 
1842 1 0.1%
 
1841 1 0.1%
 
1814 1 0.1%
 
1795 1 0.1%
 

Open
Boolean

Distinct count2
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
1 829 82.9%
 
0 171 17.1%
 

Promo
Boolean

Distinct count2
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
0 609 60.9%
 
1 391 39.1%
 

StateHoliday
Categorical

Distinct count4
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
o 973 97.3%
 
a 19 1.9%
 
c 6 0.6%
 
b 2 0.2%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Lowercase_Letter 4 100.0%
 
ValueCountFrequency (%) 
Latin 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

SchoolHoliday
Boolean

Distinct count2
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
0 822 82.2%
 
1 178 17.8%
 

StoreType
Categorical

Distinct count4
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
a 546 54.6%
 
d 326 32.6%
 
c 117 11.7%
 
b 11 1.1%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Lowercase_Letter 4 100.0%
 
ValueCountFrequency (%) 
Latin 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

Assortment
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
a 514 51.4%
 
c 479 47.9%
 
b 7 0.7%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Lowercase_Letter 3 100.0%
 
ValueCountFrequency (%) 
Latin 3 100.0%
 
ValueCountFrequency (%) 
ASCII 3 100.0%
 

CompetitionDistance
Real number (ℝ≥0)

Distinct count461
Unique (%)46.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4927.62
Minimum30.0
Maximum27650.0
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum30
5-th percentile139.5
Q1760
median2470
Q36892.5
95-th percentile18540
Maximum27650
Range27620
Interquartile range (IQR)6132.5

Descriptive statistics

Standard deviation5813.044561
Coefficient of variation (CV)1.179686048
Kurtosis1.830858064
Mean4927.62
Median Absolute Deviation (MAD)2060
Skewness1.596393661
Sum4927620
Variance33791487.07
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
250 11 1.1%
 
350 8 0.8%
 
2710 8 0.8%
 
50 7 0.7%
 
550 7 0.7%
 
230 7 0.7%
 
1070 7 0.7%
 
2190 7 0.7%
 
1080 7 0.7%
 
190 7 0.7%
 
Other values (451) 924 92.4%
 
ValueCountFrequency (%) 
30 5 0.5%
 
40 5 0.5%
 
50 7 0.7%
 
60 3 0.3%
 
70 4 0.4%
 
ValueCountFrequency (%) 
27650 1 0.1%
 
27530 2 0.2%
 
27150 1 0.1%
 
26490 1 0.1%
 
26130 1 0.1%
 

CompetitionOpenSinceMonth
Real number (ℝ≥0)

Distinct count12
Unique (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.827
Minimum1.0
Maximum12.0
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1
5-th percentile2.95
Q16
median9
Q39
95-th percentile11
Maximum12
Range11
Interquartile range (IQR)3

Descriptive statistics

Standard deviation2.691154643
Coefficient of variation (CV)0.3438296465
Kurtosis-0.2322506994
Mean7.827
Median Absolute Deviation (MAD)1
Skewness-0.8127326447
Sum7827
Variance7.242313313
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
9 465 46.5%
 
4 79 7.9%
 
11 77 7.7%
 
7 64 6.4%
 
10 51 5.1%
 
3 50 5.0%
 
12 48 4.8%
 
6 46 4.6%
 
5 42 4.2%
 
2 35 3.5%
 
Other values (2) 43 4.3%
 
ValueCountFrequency (%) 
1 15 1.5%
 
2 35 3.5%
 
3 50 5.0%
 
4 79 7.9%
 
5 42 4.2%
 
ValueCountFrequency (%) 
12 48 4.8%
 
11 77 7.7%
 
10 51 5.1%
 
9 465 46.5%
 
8 28 2.8%
 

CompetitionOpenSinceYear
Real number (ℝ≥0)

Distinct count20
Unique (%)2.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2009.496
Minimum1900.0
Maximum2015.0
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1900
5-th percentile2002
Q12008
median2011
Q32011
95-th percentile2014
Maximum2015
Range115
Interquartile range (IQR)3

Descriptive statistics

Standard deviation5.378746593
Coefficient of variation (CV)0.002676664494
Kurtosis184.3885348
Mean2009.496
Median Absolute Deviation (MAD)1
Skewness-10.28430612
Sum2009496
Variance28.93091491
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2011 398 39.8%
 
2014 70 7.0%
 
2012 64 6.4%
 
2013 57 5.7%
 
2007 56 5.6%
 
2005 56 5.6%
 
2010 54 5.4%
 
2009 52 5.2%
 
2006 45 4.5%
 
2008 36 3.6%
 
Other values (10) 112 11.2%
 
ValueCountFrequency (%) 
1900 1 0.1%
 
1961 2 0.2%
 
1990 4 0.4%
 
1999 2 0.2%
 
2000 7 0.7%
 
ValueCountFrequency (%) 
2015 30 3.0%
 
2014 70 7.0%
 
2013 57 5.7%
 
2012 64 6.4%
 
2011 398 39.8%
 

Promo2
Boolean

Distinct count2
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
1 518 51.8%
 
0 482 48.2%
 

Promo2SinceWeek
Real number (ℝ≥0)

Distinct count21
Unique (%)2.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean18.415
Minimum1.0
Maximum50.0
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1
5-th percentile5
Q114
median14
Q318
95-th percentile40
Maximum50
Range49
Interquartile range (IQR)4

Descriptive statistics

Standard deviation10.99143242
Coefficient of variation (CV)0.5968738755
Kurtosis0.4139670615
Mean18.415
Median Absolute Deviation (MAD)0
Skewness1.176794421
Sum18415
Variance120.8115866
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
14 562 56.2%
 
40 55 5.5%
 
10 38 3.8%
 
5 36 3.6%
 
37 36 3.6%
 
1 35 3.5%
 
18 32 3.2%
 
31 31 3.1%
 
22 31 3.1%
 
13 29 2.9%
 
Other values (11) 115 11.5%
 
ValueCountFrequency (%) 
1 35 3.5%
 
5 36 3.6%
 
6 1 0.1%
 
9 20 2.0%
 
10 38 3.8%
 
ValueCountFrequency (%) 
50 1 0.1%
 
48 11 1.1%
 
45 24 2.4%
 
44 5 0.5%
 
40 55 5.5%
 

Promo2SinceYear
Real number (ℝ≥0)

Distinct count7
Unique (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2011.355
Minimum2009.0
Maximum2015.0
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum2009
5-th percentile2009
Q12011
median2011
Q32012
95-th percentile2014
Maximum2015
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.215916713
Coefficient of variation (CV)0.0006045261593
Kurtosis0.8004152121
Mean2011.355
Median Absolute Deviation (MAD)0
Skewness0.7550592501
Sum2011355
Variance1.478453453
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2011 622 62.2%
 
2013 103 10.3%
 
2012 73 7.3%
 
2014 71 7.1%
 
2009 61 6.1%
 
2010 59 5.9%
 
2015 11 1.1%
 
ValueCountFrequency (%) 
2009 61 6.1%
 
2010 59 5.9%
 
2011 622 62.2%
 
2012 73 7.3%
 
2013 103 10.3%
 
ValueCountFrequency (%) 
2015 11 1.1%
 
2014 71 7.1%
 
2013 103 10.3%
 
2012 73 7.3%
 
2011 622 62.2%
 

PromoInterval
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
ValueCountFrequency (%) 
Jan,Apr,Jul,Oct 777 77.7%
 
Feb,May,Aug,Nov 123 12.3%
 
Mar,Jun,Sept,Dec 100 10.0%
 

Length

Max length16
Mean length15.1
Min length15
ValueCountFrequency (%) 
Lowercase_Letter 14 60.9%
 
Uppercase_Letter 8 34.8%
 
Other_Punctuation 1 4.3%
 
ValueCountFrequency (%) 
Latin 22 95.7%
 
Common 1 4.3%
 
ValueCountFrequency (%) 
ASCII 23 100.0%
 

month
Real number (ℝ≥0)

Distinct count12
Unique (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.821
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median6
Q38
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5

Descriptive statistics

Standard deviation3.408897121
Coefficient of variation (CV)0.5856205327
Kurtosis-1.042538365
Mean5.821
Median Absolute Deviation (MAD)3
Skewness0.2833487739
Sum5821
Variance11.62057958
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1 112 11.2%
 
2 105 10.5%
 
7 103 10.3%
 
6 100 10.0%
 
5 93 9.3%
 
3 92 9.2%
 
4 91 9.1%
 
12 72 7.2%
 
8 62 6.2%
 
11 61 6.1%
 
Other values (2) 109 10.9%
 
ValueCountFrequency (%) 
1 112 11.2%
 
2 105 10.5%
 
3 92 9.2%
 
4 91 9.1%
 
5 93 9.3%
 
ValueCountFrequency (%) 
12 72 7.2%
 
11 61 6.1%
 
10 61 6.1%
 
9 48 4.8%
 
8 62 6.2%
 

sales_per_customer
Real number (ℝ≥0)

MISSING
Distinct count829
Unique (%)100.0%
Missing171
Missing (%)17.1%
Infinite0
Infinite (%)0.0%
Mean9.601078294965776
Minimum3.6750871080139373
Maximum17.48175182481752
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum3.675087108
5-th percentile6.532093883
Q17.982791587
median9.411067194
Q310.95709571
95-th percentile13.74064806
Maximum17.48175182
Range13.80666472
Interquartile range (IQR)2.974304123

Descriptive statistics

Standard deviation2.218568376
Coefficient of variation (CV)0.2310749176
Kurtosis0.1609142098
Mean9.601078295
Median Absolute Deviation (MAD)1.463609567
Skewness0.5190732953
Sum7959.293907
Variance4.922045639
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
11.39563438 1 0.1%
 
7.984168865 1 0.1%
 
10.88076312 1 0.1%
 
6.773491592 1 0.1%
 
10.56507937 1 0.1%
 
9.438998957 1 0.1%
 
8.317200298 1 0.1%
 
9.532544379 1 0.1%
 
11.65037594 1 0.1%
 
7.7 1 0.1%
 
Other values (819) 819 81.9%
 
(Missing) 171 17.1%
 
ValueCountFrequency (%) 
3.675087108 1 0.1%
 
4.028846154 1 0.1%
 
4.11010453 1 0.1%
 
4.37366167 1 0.1%
 
4.780044101 1 0.1%
 
ValueCountFrequency (%) 
17.48175182 1 0.1%
 
17.26457399 1 0.1%
 
16.81946625 1 0.1%
 
16.39049236 1 0.1%
 
15.83527132 1 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
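
The calculation described above can be sketched in plain Python. This is an illustration of the textbook formula, not the report generator's code. The Sales and Customers values are the ten "First rows" of this report, used only as a small example; the report's own coefficient is computed over all 1000 observations.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

# Sales and Customers from the ten "First rows" of this report.
sales = [3328, 0, 9739, 12225, 0, 4151, 4886, 0, 7433, 0]
customers = [315, 0, 809, 906, 0, 464, 673, 0, 855, 0]
print(round(pearson_r(sales, customers), 3))
```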

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, where -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
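
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is an assumption here (it is the common convention), and all function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear
print(spearman_rho(x, y), round(pearson_r(x, y), 3))
```

On this toy input ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.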

Kendall's τ

Like Spearman's rank correlation coefficient, Kendall's rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, where -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
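
A minimal sketch of the pair-counting definition above. It implements the plain variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently apply a tie correction (tau-b), so their results can differ when ties are present. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```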

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
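
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table. It follows the correction commonly attributed to Bergsma (2013); whether the report generator computes it in exactly this way is an assumption, and the function name is illustrative.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions accordingly.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

independent = cramers_v_corrected([[25, 25], [25, 25]])
associated = cramers_v_corrected([[50, 0], [0, 50]])
print(independent, associated)
```

The first table has identical rows, so the statistic is 0; the second is perfectly diagonal, so the statistic is 1 up to floating-point rounding.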

Missing values

Sample

First rows

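(index) | df_index | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval | month | sales_per_customer
0 | 1010994 | 1109 | 6 | 2015-01-17 | 3328 | 315 | 1 | 0 | o | 0 | c | a | 3490.0 | 4.0 | 2011.0 | 1 | 22.0 | 2012.0 | Jan,Apr,Jul,Oct | 1 | 10.565079
1 | 433317 | 475 | 7 | 2014-06-01 | 0 | 0 | 0 | 0 | o | 0 | a | a | 140.0 | 9.0 | 2005.0 | 0 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct | 6 | NaN
2 | 281004 | 310 | 6 | 2014-12-13 | 9739 | 809 | 1 | 0 | o | 0 | a | c | 2290.0 | 9.0 | 2011.0 | 1 | 10.0 | 2014.0 | Mar,Jun,Sept,Dec | 12 | 12.038319
3 | 316424 | 348 | 3 | 2013-12-18 | 12225 | 906 | 1 | 1 | o | 0 | a | a | 16490.0 | 9.0 | 2011.0 | 1 | 22.0 | 2012.0 | Jan,Apr,Jul,Oct | 12 | 13.493377
4 | 315963 | 347 | 7 | 2013-02-24 | 0 | 0 | 0 | 0 | o | 0 | d | c | 9360.0 | 7.0 | 2013.0 | 1 | 22.0 | 2012.0 | Mar,Jun,Sept,Dec | 2 | NaN
5 | 277645 | 306 | 3 | 2013-10-30 | 4151 | 464 | 1 | 0 | o | 1 | a | a | 5100.0 | 4.0 | 2007.0 | 1 | 40.0 | 2014.0 | Jan,Apr,Jul,Oct | 10 | 8.946121
6 | 544903 | 598 | 3 | 2015-07-22 | 4886 | 673 | 1 | 0 | o | 0 | c | a | 550.0 | 12.0 | 2013.0 | 1 | 40.0 | 2014.0 | Jan,Apr,Jul,Oct | 7 | 7.260030
7 | 251179 | 277 | 7 | 2015-07-12 | 0 | 0 | 0 | 0 | o | 0 | d | c | 7840.0 | 9.0 | 2011.0 | 1 | 31.0 | 2009.0 | Feb,May,Aug,Nov | 7 | NaN
8 | 966281 | 1060 | 5 | 2015-02-13 | 7433 | 855 | 1 | 0 | o | 0 | a | c | 3430.0 | 9.0 | 2011.0 | 1 | 31.0 | 2013.0 | Feb,May,Aug,Nov | 2 | 8.693567
9 | 655250 | 720 | 7 | 2013-06-23 | 0 | 0 | 0 | 0 | o | 0 | a | c | 15320.0 | 3.0 | 2011.0 | 1 | 14.0 | 2013.0 | Feb,May,Aug,Nov | 6 | NaN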

Last rows

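(index) | df_index | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval | month | sales_per_customer
990 | 292769 | 322 | 5 | 2013-03-08 | 5862 | 597 | 1 | 1 | o | 0 | a | a | 17500.0 | 4.0 | 2001.0 | 1 | 37.0 | 2009.0 | Jan,Apr,Jul,Oct | 3 | 9.819095
991 | 633191 | 695 | 2 | 2013-06-04 | 6534 | 644 | 1 | 1 | o | 0 | a | a | 550.0 | 7.0 | 2011.0 | 1 | 1.0 | 2012.0 | Jan,Apr,Jul,Oct | 6 | 10.145963
992 | 688715 | 757 | 2 | 2015-04-07 | 6920 | 654 | 1 | 0 | o | 1 | a | c | 3450.0 | 9.0 | 2011.0 | 0 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct | 4 | 10.581040
993 | 517459 | 568 | 7 | 2015-05-03 | 0 | 0 | 0 | 0 | o | 0 | d | c | 4270.0 | 9.0 | 2011.0 | 1 | 1.0 | 2013.0 | Jan,Apr,Jul,Oct | 5 | NaN
994 | 463499 | 508 | 1 | 2013-05-13 | 7905 | 721 | 1 | 1 | o | 0 | a | c | 1280.0 | 9.0 | 2011.0 | 1 | 40.0 | 2011.0 | Jan,Apr,Jul,Oct | 5 | 10.963939
995 | 702008 | 771 | 5 | 2013-12-20 | 10197 | 1043 | 1 | 1 | o | 0 | a | a | 20640.0 | 9.0 | 2007.0 | 0 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct | 12 | 9.776606
996 | 623582 | 685 | 4 | 2014-06-12 | 5592 | 570 | 1 | 0 | o | 0 | a | a | 650.0 | 11.0 | 2013.0 | 1 | 37.0 | 2009.0 | Jan,Apr,Jul,Oct | 6 | 9.810526
997 | 826738 | 908 | 3 | 2015-03-25 | 3080 | 307 | 1 | 0 | o | 0 | a | a | 1980.0 | 7.0 | 2010.0 | 1 | 37.0 | 2009.0 | Jan,Apr,Jul,Oct | 3 | 10.032573
998 | 411915 | 452 | 6 | 2014-09-13 | 3045 | 292 | 1 | 0 | o | 0 | a | c | 1850.0 | 8.0 | 2013.0 | 1 | 5.0 | 2011.0 | Feb,May,Aug,Nov | 9 | 10.428082
999 | 72328 | 80 | 6 | 2015-04-04 | 10578 | 844 | 1 | 0 | o | 0 | d | a | 7910.0 | 9.0 | 2011.0 | 0 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct | 4 | 12.533175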